May Prairie State Natural Area (Site 28).
We have species composition data from 1031 1m^2 quadrats representing 27 sites.
Nearly all statistical packages require the data to be in a presence-absence form. There are several ways to do it (one of which can maintain cover values rather than changing it to binary data); I used a loop function. The result is a presence-absence matrix with Site as a column so subsamples can be organized accordingly. This column must be deleted for every analysis, however!
| Site | SCICYP | PROPEC | JUNDEB | JUNREP | CXJOOR | CXLOUI | |
|---|---|---|---|---|---|---|---|
| 1.1.1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| 1.1.2 | 1 | 1 | 1 | 1 | 0 | 0 | 1 |
| 1.1.3 | 1 | 1 | 1 | 0 | 0 | 0 | 1 |
| 1.1.4 | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
| 1.1.5 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
Hill numbers are a family of diversity indices that combine relative abundance and species richness in various ways. The most common ones are Shannon and Simpson diversity indices, though species richness itself is also technically a Hill number!
Each site was split into its own dataframe, then had that pesky “site” column removed. We then need to unite them all into a list (object). We then must use the function “as.incfreq”, which is part of the iNEXT package, to convert the data into a special incidence frequency list that the package expects.
#Make a list of all site dataframes
site.list.all = list(S01=S01,S02=S02,S03=S03,S04=S04,S05=S05,S06=S06,S07=S07,S08=S08,S09=S09,S10=S10,S11=S11,S12=S12,S13=S13,S14=S14,S16=S16,S17=S17,S18=S18,S19=S19,S20=S20,S21=S21,S22=S22,S23=S23,S24=S24,S25=S25,S26=S26,S27=S27,S28=S28)
#Convert everything in list to incidence frequencies
site.list.freq.all = lapply(site.list.all, as.incfreq)Finally we use the iNEXT command to work its magic and create lots of curves based on resampling the data over and over in random order. If we change the q argument to 1 or 2, the output will return curves for Shannon and Simpson Diversity Indices respectively. We want richness, so we’ll leave it at 0. Nboot is the number of resample permutations (bootstraps); 10,000 is a lot. Note that the output must be plotted with iNEXT’s special system based on ggplot (creatively named ggiNEXT); most syntax is the same.
out.inc.all<-iNEXT(site.list.freq.all, q=0, datatype="incidence_freq", nboot=10000)ggiNEXT(out.inc.all, type=1, color.var="site") +
theme_bw(base_size = 18) +
ylab("Species Richness") +
xlab("Number of Quadrats") +
ggtitle("Projectile unicorn vomit") +
labs(caption="B=10000.")
This plot is not particularly helpful other than to visualize the general span of observed and expected richnesses and sampling efforts.
We’ll split up the plots to visualize the curves in more manageable doses. Note that sites were plotted for spatial efficiency rather than any logical order, and that each plot uses different scales.
Other than using it to make species accumulation curves, we can use the iNEXT object to extract richness and diversity values. This object also outputs standard error as well as lower and upper estimates for each value (these were ommitted in the table below for brevity).
| Site | RichnessObs | RichnessExp | ShannonObs | ShannonExp | SimpsonObs | SimpsonExp | |
|---|---|---|---|---|---|---|---|
| 1 | S01 | 33 | 38.387 | 20.533 | 22.985 | 15.633 | 16.636 |
| 6 | S06 | 22 | 32.407 | 11.445 | 12.431 | 8.769 | 8.957 |
| 7 | S07 | 68 | 102.908 | 52.888 | 72.166 | 43.263 | 54.541 |
| 8 | S08 | 91 | 119.520 | 67.075 | 82.634 | 52.805 | 60.756 |
| 10 | S10 | 51 | 153.307 | 20.113 | 25.348 | 10.620 | 10.853 |
| 26 | S27 | 24 | 24.391 | 16.522 | 17.776 | 11.469 | 12.061 |
| 27 | S28 | 132 | 187.467 | 88.698 | 106.193 | 66.864 | 73.784 |
Let’s visually interpret how the observed richness compares with the expected richness.
Oh. Yikes. Some of those look pretty rough. Let’s compare completion percentage (observed/expected) with sampling effort (# quadrats sampled/site radius) to see if some sites were unexpectedly low due to obvious undersampling.
There is no relationship between sampling effort and completion percentage (p=0.6366006). However, note that Sites 10 and 26 were flagged as outliers by the autoplot function. This inadequate sampling is likely the result of too few quadrats sampled at the wetland edge relative to the size of the wetland.
There appears to be a relationship between site richness and site area, but unexpectedly this relationship appears to be negative. Because the data are likely non-linear, a generalized linear model should be used to assess this relationship.